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Abstract 

Background: Metabolic network reconstructions formalize our knowledge of metabolism. Gaps in these networks 
pinpoint regions of metabolism where biological components and functions are "missing." At the same time, a 
major challenge in the post genomic era involves characterisation of missing biological components to complete 
genome annotation. 

Results: We used the human metabolic network reconstruction RECON 1 and established constraint-based 
modelling tools to uncover novel functions associated with human metabolism. Flux variability analysis identified 
175 gaps in RECON 1 in the form of blocked reactions. These gaps were unevenly distributed within metabolic 
pathways but primarily found in the cytosol and often caused by compounds whose metabolic fate, rather than 
production, is unknown. Using a published algorithm, we computed gap-filling solutions comprised of non- 
organism specific metabolic reactions capable of bridging the identified gaps. These candidate solutions were 
found to be dependent upon the reaction environment of the blocked reaction. Importantly, we showed that 
automatically generated solutions could produce biologically realistic hypotheses of novel human metabolic 
reactions such as of the fate of iduronic acid following glycan degradation and of N-acetylglutamate in amino acid 
metabolism. 

Conclusions: The results demonstrate how metabolic models can be utilised to direct hypotheses of novel 
metabolic functions in human metabolism; a process that we find is heavily reliant upon manual curation and 
biochemical insight. The effectiveness of a systems approach for novel biochemical pathway discovery in mammals 
is demonstrated and steps required to tailor future gap filling algorithms to mammalian metabolic networks are 
proposed. 




Background 

An in silico model of a genome-scale metabolic network 
reconstruction is based upon a biochemically, genetically 
and genomically (BiGG) structured knowledge base 
[1,2]. It is subject to research that, in many cases, entails 
predicting an organism's phenotypic response to gene 
deletions and/or environmental perturbations in silico. 
These properties have resulted in widespread applica- 
tions of metabolic models in microbial bioengineering, 
contextualisation of high-throughput data, and biochem- 
ical pathway discovery [2,3]. While the number of 
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microbial genome-scale metabolic networks has 
increased exponentially over the past 10 years [4], fewer 
have been reconstructed for higher eukaryotes as their 
inherent complexity results in larger and more complex 
models which are harder to experimentally validate 
[5,6]. To date, only mouse [7-9], bovine [10], and 
human [11-13] genome-scale metabolic networks have 
been reconstructed. These latter ones have successfully 
been applied to systems driven eukaryotic metabolic 
research. For example, the human genome-scale meta- 
bolic network RECON 1 has been used to reveal tran- 
scriptional regulatory signatures of type 2 diabetes [14], 
to create tissue-specific models [15], to predict drug-off 
target effects [16], and to simulate cell specific metabolic 
changes upon pathogen infection [17]. Their potential to 
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discover novel metabolic functions has however not 
been demonstrated. 

RECON 1 accounts for 1496 ORFs, 2004 proteins, 
2766 metabolites, and 3311 metabolic reactions. It 
represents one of the most comprehensive biological 
network reconstructions published to date [11]. RECON 
1 is bipartite in nature. First, it is a BiGG knowledge 
base containing information on the components of 
human metabolism, mined by extensive literature review 
[4] and accessible online at http://bigg.ucsd.edu [18]. Sec- 
ond, it is a mathematical model in which reactions and 
their metabolites are presented in matrix format using 
metabolite stoichiometries. RECON 1 is therefore also a 
stoichiometric matrix amenable to constraint-based ana- 
lysis using linear programming as the reaction stoichio- 
metries impose constraints on the flow of metabolites 
through the network. As an additional constraint, a 
quasi steady state is assumed, meaning that the total 
amount of any metabolite being produced must equal 
the total amount of that metabolite being consumed. 
Using constraint-based analysis, the flow of metabolites 
through the network reactions can be calculated. In 



addition, and perhaps most importantly, reactions can 
be identified which cannot carry flux due to metabolites 
being either only produced or consumed in the network 
[4,19]. These latter reactions and metabolites represent 
gaps in the metabolic network and are referred to as 
blocked reactions and dead-end metabolites, respec- 
tively. Dead-end metabolites can result in multiple 
blocked reactions, referred to as blocked reaction cas- 
cades. Consequently they have been further charac- 
terised as root-no production or root-no consumption 
metabolites to distinguish them from other blocked 
metabolites within the blocked cascade [20] (Figure la). 
In this manner RECON 1 can be utilised to pinpoint 
parts of metabolic pathways where knowledge is incom- 
plete. A recent investigation highlighted that novel bio- 
chemical reactions can still be uncovered in the central 
metabolism of E. coli by employing a systems approach 
[21]. Considering that 30-50% of all known enzyme 
activities are not associated with genes [22-24] and over 
50% of all genes in higher organisms are not associated 
with protein function [25], it appears that many human 
biological processes have yet to be discovered. Enzymes 
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Figure 1 Automated gap filling of RECON 1 using the SMILEY algorithm. A) A simplified metabolic network. Reactions that are able to carry 
flux are shown in blue. Reactions unable to carry flux (red) are blocked and are caused by a root no-consumption metabolite (a in 1) and a root 
no-production metabolite (b in 2). Dead end metabolites can cause multiple blocked reactions refferred to as a cascade of blocked reactions. 
Reactions 2 and 3 occur in a blocked cascade caused by b. Note that c is a blocked intermediate but not a root-no production metabolite. B) 
The SMILEY algorithm identified reactions in the metabolic reaction matrix S (e.g. RECON 1) that were unable to carry flux under steady state 
conditions and then computed resolving reactions found in either U or X that needed to be added to S in order to restore flux through the 
blocked reaction. C) SMILEY solutions were categorised based on the resolving reactions required to restore flux. A category I reversal solution, if 
added to the network shown in A, only restores flux through reaction 1. The category II solution, addition of a novel metabolic reaction, restores 
flux through reactions 1, 2 and 3. The category III transport solution restores flux through reactions 2 and 3 only. SMILEY can suggest multiple 
solutions for a blocked reaction. 
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with no known genes (e.g. orphan enzymes) and genes 
of unknown function underUe metaboUc network gaps 
that manifest themselves as blocked reactions and dead- 
end metabolites in RECON 1. As the function and/or 
substrate specificity of many known human enzymes has 
not been fully elucidated, these similarly serve as candi- 
dates for filling network gaps. 

Multiple computational algorithms have been devised 
to address gaps in metabolic network models in an 
automated manner [26]. These methods utilise meta- 
bolic network analysis, such as flux variability analysis 
(FVA) [27], in order to identify network gaps, alongside 
comparative genomics/metabolomics to suggest candi- 
date metabolic reactions capable of restoring flux 
through the blocked reaction and/or dead-end metabo- 
lite. Experimental data, such as organism growth profiles 
[28,29] or metabolic flux data [30] can be integrated 
with metabolic models in order to highlight model gaps, 
although this is not always a requirement [26]. One 
such algorithm is the SMILEY algorithm, which has 
been successfully used to uncover novel metabolic func- 
tions required to explain discrepancies between the 
observed and model predicted growth phenotypes of E. 
coli on various substrates [28]. 

Because of the difference between humans and micro- 
organisms in terms of the number of related organisms 
for which biological, biochemical, and genetic informa- 
tion exists, it was not known whether an automated 
approach to gap-filling of RECON 1 could yield biologi- 
cally plausible hypotheses. Here we investigated the 
potential of RECON 1 for discovery of novel reactions 
involved in human metabolism. We used FVA to iden- 
tify dead-end metabolites and blocked reactions in 
RECON 1 and SMILEY to propose reactions capable of 
restoring flux through the identified dead-end metabo- 
lites. We then characterised the metabolic pathway dis- 
tribution of identified gaps and their solution types in 
order to get an idea of how gaps are distributed within 
RECON 1 and in what manner these gaps are resolved 
by SMILEY. Finally we validated the automatically gen- 
erated metabolic reaction hypotheses by manual litera- 
ture review in order to assess the biological relevance of 
the proposed solutions with the goal of identifying suita- 
ble experimental targets. 

Results 

In this study we first identified blocked reactions and 
dead-end metabolites occurring in RECON 1 using FVA 
[27]. We then employed SMILEY [28] to compute reac- 
tions that could be added to the RECON 1 (S) from 
universal reaction databases (U, X) to enable flux 
through a blocked reaction (Figure lb). The matrix U 
was compiled from an extensive list of known metabolic 
reactions obtained from the KEGG database [31] while 



the matrix X contained transport reactions in and out 
of the system for every metabolite contained within S 
and U. SMILEY therefore proposed what reactions, irre- 
spective of organism, needed to be added to RECON 1 
in order to fill a network gap. If none were identified, 
SMILEY suggested transport of the dead-end metabolite 
into or out of the cell. Up to twenty solutions were sug- 
gested for each knowledge gap and each solution could 
be composed of multiple resolving reactions. Note that 
the number of solutions returned by SMILEY is user 
defined. Inspired by Satish Kumar et al. [20] we split the 
computed solutions into three categories based upon 
whether the complete solution involved a reversal of the 
directionality of the blocked reaction, addition of novel 
reaction(s), or addition of a transport reaction, defined 
as category I-III type solutions respectively (Figure Ic). 
The next four sections deal with the characterisation 
and metabolic pathway distribution of the gaps that we 
addressed in RECON 1 while the remainder of the 
results chapter reports analysis of SMILEY solutions and 
specific case studies. 

Gap analysis basis 

We identified 175 blocked reactions in 80 reaction cas- 
cades that were caused by 109 dead-end metabolites 
found in RECON 1. These numbers corresponds to 5% 
and 4% of the total number of reactions and metabolites 
accounted for in RECON 1, respectively. We observed 
that over half of the blocked reactions were caused by 
root no-consumption metabolites while roughly a quar- 
ter were due to root no-production metabolites with the 
remainder caused by both types of dead-end metabolites 
(Figure 2a). Reactions that have both types of dead-end 
metabolites represent metabolic reactions which are 
entirely uncoupled from the metabolic network, and 
whose effect on global metabolism can therefore not be 
described accurately by the metabolic model. 

Sub-cellular distribution of gaps 

The sub-cellular distribution of the dead-end metabo- 
lites and blocked reactions showed that the majority 
were found in the cytosol with the remainder distributed 
within the various cellular compartments, most notably 
in the lysosome, mitochondria, and peroxisome (Figure 
2b). This observation agrees with the distribution of 
dead-end metabolites found in the eukaryotic metabolic 
reconstruction iND750 for Saccharomyces cerevisiae 
[20,32]. Because some of the dead-end metabolites were 
responsible for multiple blocked reactions, which them- 
selves could be associated with both types of dead-end 
metabolite, there was not a direct relationship between 
the number of dead-end metabolites and the number of 
blocked reactions within cellular compartments. In addi- 
tion, blocked reactions can take part in metabolic 
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Figure 2 Characterisation of blocked reactions in RECON 1. A) Classification of blocked reactions in RECON 1 depending on their causative 

dead-end metabolite. B) The cellular distribution of dead-end metabolites, blocked reactions, and blocked reaction cascades within cellular 

compartments accounted for in RECON 1. C) The metabolic pathway distribution of the 175 blocked reactions investigated. D) The distribution 

of BiGG database confidence scores for the blocked reactions investigated. 3 = biochemical and or genetic evidence, 2 = physiological evidence 

or evidence from a nonhuman mammalian cell, 1 = modelling evidence, 0 = unevaluated. 
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subsystems/reaction cascades that span more than one 
cellular compartment. For example, in the mitochondria 
we found 24 blocked reactions participating in six dis- 
tinct reaction cascades and ten dead-end metabolites. 
However, only seven of the dead-ends were responsible 
for the 24 blocked reactions while the remaining three 
metabolites caused blocked reactions in the cytosol. Gap 
distributions can be better understood by looking at the 
distribution of blocked reaction cascades caused by the 
dead-end metabolites. These data suggest that the num- 
ber of knowledge gaps in cellular compartments other 
than the cytosol is around two to five knowledge gaps 
per compartment (Figure 2). 

Metabolic pathway distribution of gaps 

The metabolic pathway distribution of the blocked reac- 
tions is shown in Figure 2C and reflects their sub-cellu- 
lar distribution. All of the metabolic pathways shown 
had blocked reactions in the cytosol. Most of the 
blocked reactions were observed in amino acid metabo- 
lism with the remainder distributed within the metabolic 
pathways shown. Of those involved in amino acid meta- 
bolism, nearly half were part of multiple reaction cas- 
cades in tryptophan metabolism (Additional file 1). 



These reactions were blocked due to root no-consump- 
tion heterocyclic derivatives of tryptophan, such as 
anthranilate, kynurenic acid, 5-methoxy-indole acetate, 
and more, suggesting missing information on the meta- 
bolic fate of these compounds. Apart from knowledge 
gaps in tryptophan metabolism, blocked reactions were 
also identified in metabolic subsystems associated with 
eleven of the twenty most common amino acids. These 
gaps were due to single blocked reactions or reaction 
cascades implying that current information concerning 
the metabolism of these amino acids is fairly complete. 

Multiple blocked reactions were also observed in gly- 
can biosynthesis, the metabolism of cofactors/vitamins 
and lipid metabolism. As opposed to those observed in 
amino acid metabolism, these blocked reactions were 
involved in relatively few reaction cascades. Many of the 
blocked reactions in glycan metabolism, for example, 
were part of two reaction cascades involved in the 
degradation of heparan sulfate and dermatan sulfate in 
the lysosome, respectively. The root no-consumption 
metabolite glucose- 1, 3-mannose and derivatives thereof 
were also the cause of several blocked reactions in gly- 
can biosynthesis. Many blocked reactions involved in 
the metabolism of cofactors/vitamins took place in the 
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mitochondria due to just two root no-consumption 
metabolites causing gaps in the biosynthesis of ubiqui- 
none and vitamin D. Similarly, multiple reactions in 
lipid metabolism were blocked in glycerophospholipid 
biosynthesis in the cytosol due to the root no-consump- 
tion metabolite plasmalogen in the cytosol In total, 32% 
of the knowledge gaps investigated were associated with 
two or more reactions. Finding a solution to the dead- 
end metabolites in these reaction cascades could there- 
fore result in the connection of multiple blocked reac- 
tions back into the metabolic network. The majority of 
blocked reactions were not, however, part of reaction 
cascades (Additional file 2). 

Knowledge status of reactions causing gaps 

We investigated whether the gaps addressed in this 
study were correlated with a lack of information avail- 
able for each blocked reaction by assessing their confi- 
dence scores. Each reaction in RECON 1 has an 
assigned confidence score that allows the experimental 
evidence underlying the reaction to be quickly assessed 
[11]. Figure 2D shows the distribution of confidence 
scores of the reactions investigated. Roughly two thirds 
of the blocked reactions were supported by biochemical 
or physiological evidence, which is similar to what is 
observed for all reactions contained in RECON 1 [11]. 
This distribution implies that the addressed knowledge 
gaps are not simply due to ill-defined metabolic reac- 
tions. Rather, the fate of the participating metabolites 
within the metabolic network or how they contribute to 
human metabolism is not known. In order to suggest 
plausible hypotheses of how this might take place, we 
investigated whether the blocked reactions could be cir- 
cumvented or connected back into the RECON 1 using 
reactions found in the KEGG database [31] in an auto- 
mated manner by running the SMILEY algorithm [28]. 

Solutions to a blocked reaction are dependent upon the 
robustness of its metabolic network 

The SMILEY algorithm suggested up to twenty solutions 
for each blocked reaction. Each solution fell into one of 
the three category types outlined in Figure 1. We first 
investigated how trivial it was to bypass the blocked reac- 
tions using reactions from the KEGG database (U 
matrix). Nearly half of the blocked reactions had more 
than one solution indicating that flux could be restored 
through these reactions in multiple ways (Figure 3a). 
Underlying this is that the more robust the non-organism 
specific reaction environment of the blocked reaction is, 
the more likely that the blocked reaction will have multi- 
ple solutions as there will be more known reactions cap- 
able of acting upon the blocked reaction components. 
For example, a blocked reaction caused by a common, 
well-characterized metabolite that is a component of 
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Figure 3 Characterisation of gap filling solutions proposed by 
SMILEY. A) The number of SMILEY solutions proposed for each 
blocked reaction was not even, suggesting that it is easier to 
incorporate some dead-end metabolites into RECON 1 over others. 
Approximately 49% of the blocked reactions had a single proposed 
solution while 29% could be circumvented in a highly dynamic 
manner with twenty proposed solutions. B) 85% of the blocked 
reactions had SMILEY solutions composed of less than three 
resolving reactions. 



multiple metabolic reactions will have multiple SMILEY 
solutions while the opposite is true for blocked reactions 
caused by less well characterized metabolites. The num- 
ber of solutions obtained for a blocked reaction was inde- 
pendent of the metabolic pathway (Additional file 3). 
Blocked reactions in RECON 1 are therefore not caused 
only by dead-end metabolites whose global metabolic 
role is unknown, rather metabolites and reactions can be 
well characterized metabolic components of other organ- 
isms while there fate is not known in humans. The result 
complements our earlier observation that knowledge 
gaps are found in all metabolic pathways, independent of 
the confidence scores of the reactions making up the par- 
ticular pathway. 

SMILEY solutions involve few resolving reactions 

Inspection of the SMILEY output made it clear that 
assessing each of the 1335 SMILEY solutions repre- 
sented a time consuming task. Our focus was to assess 
whether SMILEY could generate biologically relevant 
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hypotheses of missing reactions in human metaboUsm 
rather than produce a detailed list of missing enzyme 
functionalities. We therefore decided to focus on the 
SMILEY solutions containing the least number of resol- 
ving reactions for each of the 175 blocked reactions 
hereafter referred to as SI solutions. 

Analysis of SI solutions showed that flux could be 
restored through all the blocked metabolites by incor- 
poration of relatively few resolving reactions. Interest- 
ingly, 85% of the blocked reactions were circumvented 
with just one or two resolving reactions (Figure 3b). We 
observed that this was due to the category of the SMI- 
LEY solutions (Figure 4). 74% of the blocked reactions 
had a category I or III solution, which in the majority of 
cases involved at most two resolving reactions. The 
remaining blocked reactions had category II solutions 
where novel reaction(s) had to be added in order to 
incorporate the blocked metabolite. 

Mindful that we might overlook plausible solutions for 
blocked reactions having multiple SMILEY solutions by 
focusing entirely on SI solutions, we investigated the 
categories of the alternative solutions. We found that, 
when available, alternative solutions were either of the 
same solution category as the SI solution or were cate- 
gory III transport solutions (Additional file 4). This 
result indicated that the SI solutions directly reflect the 
solution categories available for each blocked reaction 
although the resolving reactions comprising the solu- 
tions were different. 

Blocked reactions have different SI solutions depending 
on metabolic origin 

Roughly one third of all the blocked reactions were 
resolved with category I solutions (Figure 4). Nearly half 
of these applied to blocked reactions involved in amino 
acid metabolism (Figure 4B), many of which were part 
of tryptophan and lysine metabolism in the cytosol. 
With respect to their metabolic origin, up to 70% of the 
total number of blocked reactions originating from a 
particular metabolic pathway had category I solutions 
(Additional file 5). Notably, none of the blocked reac- 
tions involved in glycan biosynthesis and metabolism 
had category I solutions. Instead, these reactions were 
primarily resolved by category III transport solutions. 
These applied to 90% of the blocked reactions involved 
in glycan biosynthesis and metabolism, more specifically 
to those in chondroitin sulphate and heparan sulphate 
degradation in the lysosome and N-glycan biosynthesis 
in the golgi apparatus. Apart from glycan metabolism, 
category III solutions applied to multiple blocked reac- 
tions involved in the metabolism of cofactors and vita- 
mins and in lipid metabolism. Few blocked reactions 
within amino acid metabolism and the biosynthesis of 
secondary metabolites had category III solutions. 



Category II solutions were the least common solution. 
With respect to their metabolic origin, 10-50% of the 
blocked reactions had this solution type (Additional file 

5) . Many of these applied to blocked reactions taking 
part in various metabolic subsystems of amino acid 
metabolism such as tyrosine metabolism and urea cycle 
metabolism. Multiple blocked reactions in carbohydrate 
metabolism and fatty acid metabolism also had category 
II solutions. 

Validation of SMILEY solutions 

Category I and III solutions were only generated when 
no other solution was possible and therefore the blocked 
reactions that had these solutions contain metabolites 
whose global metabolic role has not been defined. This 
is however not a direct indicator of whether or not 
these solutions are biologically plausible. Category I 
reversal solutions to blocked reactions were validated by 
i) quantitative assignment of blocked reaction direction- 
alities based on estimates of their Gibbs free energy 
changes [33] and ii) a query of experimentally reported 
reaction directionalities from the Brenda database [34]. 
Approximately 50% of the blocked reactions were calcu- 
lated to be reversible while 23% were reported reversible 
in the Brenda database (Figure 4C and Additional file 

6) . Only four reactions were reported reversible by both 
methods which is due to the high number of the 
blocked reactions having unspecified reaction direction- 
alities in the Brenda database (67%) and the uncertain- 
ties associated with reaction Gibbs free energy 
estimation due to unknown in vivo concentrations of 
the partaking metabolites [35]. These validation methods 
do not exclude reactions as reversible or not. Ultimately 
detailed inspection of each gap solution is required. 
Figure 5 shows the cytosolic reaction ADPMAN, which 
is catalysed by the nudix hydrolase NUDT5 (EC 
3.6.1.13) and blocked in RECON 1 due to the root no- 
production metabolite ADP-mannose (adpman) [36-38]. 
The product, mannose-l-phosphate (manlp), is con- 
verted to GDP-mannose (gdpman) from either glucose 
or mannose and is ultimately utilised in N-glycan bio- 
synthesis. A category I solution to ADPMAN was pro- 
posed by SMILEY. We found that, although the reversal 
of NUDT5 activity has not been reported, the conver- 
sion of manlp to adpman has been described in rats 
and cows (EC 2.7.7.28) [39]. Furthermore, the reaction 
MAN1PT2, catalysed by mannose-l-phosphate guanylyl- 
transferase (EC 2.7.7.13), is known to accept both ATP 
and ITP in addition to GTP although with reduced 
activity [40]. This implies that formation of adpman 
from manlp, which is in effect a reversal of EC 3.6.1.13 
activity, is plausible. The results suggest that reversing 
the directionality of blocked reactions to restore net- 
work flux can represent valid biological hypotheses 
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Figure 4 Blocked reactions have different SMILEY solutions depending upon their metabolic origin and were validated with 
experimental evidence. Blocked reactions were resolved by connecting their corresponding blocked metabolite back into RECON 1. A shows 
the number of blocked reactions resolved with each solution category and, directly below in B, the metabolic pathway distribution of the 
blocked reactions having that particular SMILEY solution. Some blocked reactions, such as those involved in amino acid metabolism, are easily 
bypassed using functionalities already described in the KEGG database. Others, such as those involved in glycan biosynthesis, can only be solved 
by transporting their causative dead-end metabolite out of the system. C) Proposed solutions were validated by comparing category I solutions 
to experimentally reported enzyme directionalities. Similarly, we investigated whether category II solutions were gene associated in humans and 
whether the dead-end metabolites, to which the category III solutions applied, have been detected in human biofluids. 
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enzyme catalysing such a reaction (EC 3.7.7.28) has been described in various mammals [39] along with the finding that mannose-1 -phosphate 
guanylyltransferase (EC 2.7.7.13) is known to be reversible and accept sugar donors other than GTP [40]. 



although the biological fate of the dead end metabolite 
remains unknown. 

In order to validate the category III solutions, we inves- 
tigated whether the metabolites to which the transport 
solutions applied have been detected in biofluids by 
querying the Human Metabolome Database [41]. The 67 
blocked reactions that had category III SI transport 
solutions were part of 24 reaction cascades. We found 
that 36 of the blocked reactions, part of eight reaction 
cascades, had blocked metabolites that have been 
detected in biofluids (Figure 4C). For example, we iden- 
tified a cascade of five blocked reactions involved in der- 
matan sulfate degradation in the lysosome. Dermatan 
sulfate degradation is modelled in RECON 1 by the 
degradation of a glycoseaminoglycan chain of a defined 
length (Figure 6). Following initial cleavage of the poly- 
saccharide from a serine residue at the reducing end of 
the core tetrasaccharide linkage by dermatan proteogly- 
can preotease, degradation occurs through a series of 
enzymatic steps involving desulfonation by sulfatases 
and subsequent hydrolysis of the glycosidic bonds con- 
necting the monosacharide building blocks, iduronic 
acid and N-galactosamine, by exoglycosidases [42,43]. 
We found that dermatan sulfate degradation was 
blocked due to iduronic acid buildup in the cytosol fol- 
lowing its cleavage from the dermatan polysaccharide in 
the reaction IDOAASE4ly (EC 3.2.1.76) and transport 
out of the lysosome in reaction IDOURtly [44-46]. The 
SMILEY solution for the blocked reaction IDOAASE4ly 
proposed transport of iduronic acid out of the cytosol in 
order to restore flux through the dermatan sulfate sub- 
system. These results suggest a transport function for 



iduronic acid out of the cytosol, which appears plausible 
as iduronic acid has been detected in urine (M. Fuller, 
personal communication; [43,47]. Iduronic acid was also 
identified as a root no-consumption metabolite in 
heparan sulfate degradation causing twelve blocked 
reactions. The transport solution for iduronic acid 
therefore resolves 18 blocked reactions occurring in gly- 
can degradation. We found no reports of cell surface 
iduronic acid transporters in the literature. Detailed sub- 
strate specificity studies for human hexose transporters, 
however, have not been performed [48]. 

As opposed to category I and III SMILEY solutions, 
category II solutions involve adding resolving reactions 
to RECON 1 to restore flux through the dead-end meta- 
bolites. Of the 45 blocked reactions with category II 
solutions we found that seven had solutions composed 
of resolving reactions associated with human genes. Of 
those, five were part of peroxisomal beta-oxidation of 
long chain unsaturated fatty acids (Figure 7). These 
blocked reactions were aU catalysed by 3-ketoacyl coen- 
zyme A thiolase (EC 2.3.1.16), an enzyme responsible 
for the last step of the four recurring steps of fatty acid 
P -oxidation [49]. The SMILEY solutions successfully 
coupled the root no-production metabolites of these 
reactions, involving 3-oxo fatty acid coenzyme A deriva- 
tives, to their corresponding fatty acid in three resolving 
reactions. For example, the blocked reaction ACACT4p 
that forms octanoyl coenzyme A (ocCoA) from the root 
no-production metabolite 3-oxodecanoyl-CoA (3odCoA) 
was coupled to decanoyl-CoA (dcaCoA) through its 
stepwise oxidation by trans-2-enoyl-CoA reductase (EC 
1.3.1.38), enoyl-CoA hydratase (EC 4.2.1.17), and 3- 
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Figure 6 Category III solution for five blocked reactions occurring in dermatan sulfate degradation. Iduronic acid (idour, boxed in red) 
was identified as a root no-consumption metabolite in dermatan sulfate degradation in the lysosome. Flux was restored through the blocked 
reactions (red arrows) by addition of an extracellular transport function for idour (green arrow). Review of the literature indicates that this SMILEY 
generated hypothesis is biologically plausible. See text for details. 



hydroxyacyl-CoA dehydrogenase (EC 1.1.1.35). Peroxiso- 
mal p -oxidation of unsaturated long chain fatty acids 
has not been completely elucidated in humans. How- 
ever, it has been shown to generate ocCoA which is 
then transported to the mitochondria where it can be 
further oxidised and coupled to ATP production [50] as 
suggested by SMILEY. Although these results do not 



add to current knowledge of metabolism as these gaps 
result from incomplete modelling in RECON 1 of P -oxi- 
dation in the peroxisome, they suggest that SMILEY is 
capable of generating biologically realistic results to gaps 
in RECON 1. 

The majority of category II solutions were composed 
of resolving reactions not associated with any known 
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Figure 7 Category II solution for blocked reactions in peroxisomal fatty acid degradation. The blocked reactions (red arrows) are all 
catalysed by acetyl-CoA-acyltransferase (EC 2.3.1.16). The SMILEY solutions (green arrows) for each of the five blocked reactions generated a 
complete degradation pathway for saturated fatty acids in the peroxisome starting with stearoyl-CoA (StCoa) and ending in octanoyl-CoA 
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acids [50,79]. 



genes (19 solutions) or resolving reactions associated 
with enzyme activities encoded for by non-human genes 
(19 solutions) (Figure 4C). These represented the most 
exciting SMILEY solutions as they have the potential to 
unearth novel human metabolic enzymes. Figure 8 
shows an example of category II solutions to three 
blocked reactions in urea metabolism. The SMILEY 
solution to AGPRim (EC 1.2.1.41), involves formation of 
its root no-production metabolite N-acetylglutamate-5- 
phosphate (acgSp) from N-acetylglutamate (acglu) and 
production of N-acetylornithine (acorn) from the root 
no-consumption metabolite N-acetylglutamate-5-semial- 
dehyde (acgSsa). Acglu and acorn are themselves root 
no-consumption and root no-production metabolites in 
the blocked reactions ACGSm (EC 2.3.1.1) and ACODA 
(EC 3.5.1.16) respectively. Interestingly, the two resol- 
ving reactions proposed for AGPRim, catalysed by acet- 
ylglutamate kinase (EC 2.7.2.8) and acetylornithine 
transaminase (EC 2.6.1.11), therefore restore flux 
through all three blocked reactions. These reactions are 
part of the arginine biosynthesis pathway in prokaryotes 
and plants. They have not been associated with human 
genes, as arginine biosynthesis in mammals has been 
shown to occur through glutamate rather than N-acetyl- 
glutamate [51,52]. Human enzymes with similar func- 
tionalities have however been described. For example, 
human pyrroline-5-carboxylate synthase (P5CS) encodes 
glutamate-5-semialdehyde dehydrogenase activity (EC 



1.2.1.41) and glutamate kinase activity (EC 2.7.2.11). 
P5CS is therefore a target candidate to catalyse a reac- 
tion where acg5sa is produced from acglu. Similarly, 
while acetylornithine transaminase activity (EC 2.6.1.11) 
has not been reported in humans, ornithine transami- 
nase activity (EC 2.6.1.13) has [53]. A literature search 
did not reveal whether human ornithine transaminase is 
capable of accepting acorn as a substrate. However, 
ornithine transaminase from Plasmodium vivax shows 
42-53% sequence similarity to eukaryotic ornithine 
transaminases and is known to accept both acorn and 
orn [54]. We suggest that following manual validation 
the hypotheses generated by SMILEY can guide experi- 
mental laboratory research. For selected blocked reac- 
tions, this is currently underway in our laboratory. 

Discussion 

The investigation of large complex systems on a global 
scale makes it impractical and maybe even impossible to 
know details about all involved metabolites, genes, and 
proteins. At the same time, a high level of knowledge 
about metabolic subsystems and or enzyme activities is 
necessary in order to come up with hypotheses of parti- 
cular metabolic fates and novel reactions. In this study 
we used a systems biology approach to characterise and 
fill gaps in human metabolism. The key results include 

i) many dead-end metabolites affect reaction cascades, 

ii) computationally predicted solutions require thorough 
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Figure 8 Category II solutions to three blocked reactions in urea metabolism. Flux was restored through the blocked reaction AGPRim by 
the addition of two resolving reactions (EC 2.7.2.8 and EC 2.6.1.11) which couple AGPRim to the blocked reactions ACGSm and ACODA through 
the consumption and production of their dead-end metabolites, N-acetylglutamate and N-acetylornithine, respectively. We found no evidence 
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conversion of the dead-end metabolites of these reactions by glutamate N-acetyltransferase (EC 2.3.1.35). As opposed to the solution for 
AGPRim, we did not find any indications that reactions similar to EC 2.3.1.35 are found in humans. 



manual curation and biochemical insight, and iii) four 
biological plausible hypotheses were identified. This 
work highlights that finding gene candidates for meta- 
bolic functions in the human genome is not a trivial 
issue and the extensive manual effort to curate the 



computational predictions of candidate reactions high- 
light the overall quality and quantity of data included in 
Recon 1, 

We characterised knowledge gaps of human metabo- 
lism, represented by blocked reactions and dead-end 



Rolfsson et al. BMC Systems Biology 201 1, 5:155 
http://www.biomedcentral.eom/1 752-0509/5/1 55 



Page 12 of 16 



metabolites identified in RECON 1 [11], to which solu- 
tions, in the form of non-organism specific metabolic 
reactions, could be found using the computer algorithm 
SMILEY [28] (Figure 1). We identified 175 blocked reac- 
tions, 70% of which had high confidence scores, and 
observed that while they were unevenly distributed 
within human metabolic pathways, most were found in 
the cytosol (Figure 2). Furthermore, we found that they 
arose due to different dead-end metabolite types that 
were in some cases responsible for up to 14 blocked 
reactions. These properties are likely to affect how trivial 
it will be to address these knowledge gaps experimen- 
tally and suggest that the impact of resolving these gaps, 
both in terms of novel metabolic discovery and their 
influence on RECON 1, will be different. For example, 
determining the fate of a metabolite, which results in 
multiple blocked reactions, will have a different impact 
on the biological accuracy of RECON 1 than resolving a 
single blocked reaction. Nevertheless, a single blocked 
reaction could be of great interest as a candidate target 
for novel metabolic discovery as its components could 
represent a drug target and resolving the gap could have 
unforeseen effects on network robustness, i.e., human 
metabolism. Also, assaying cytosolic reactions will be 
more straightforward than determining the function of 
compartmentalised reactions. Subsequently, which 
knowledge gaps are chosen for experimental research is 
ultimately a human decision depending on research 
goals, biological novelty factors, ease of experimental 
validation, and underlying evidence of the knowledge 
gap's validity. 

We highlighted four examples of missing knowledge 
in human metabolism (Figures 5, 6, 7 and 8) that 
resulted in biologically plausible hypotheses using a 
combined algorithmic and manual approach. The 
hypotheses were strengthened with published experi- 
mental data. In the case of iduronic acid (Figure 6), a 
major constituent of glycosamine glycans, we argued for 
a hypothesis that an extracellular transport reaction 
needs to be added to RECON 1. Although no direct evi- 
dence for such transport could be identified in the 
human genome, the existence of iduronic acid in human 
urine (M. Fuller, personal communication) suggests that 
a transporter may be a biologically plausible solution. 
Further evidence is that the build-up of iduronic acid in 
the lysosome has been linked to lysosomal storage disor- 
der caused by defects in the sialic acid lysosomal trans- 
porter [55]. Similarly, defects in a-L-iduronidase 
(3.2.1.76), the exo-glycohydrolase that cleaves iduronic 
acid off the non-reducing end of dermatan sulfate and 
heparan sulfate are known to cause a different type of 
lysosomal storage disorder, called mucopolysacharideosis 
I [43]. Despite its apparent involvement in disease, the 
metabolic fate of iduronic acid is unknown. The present 



work highlights knowledge gaps in human metabolic 
processes, such as the fate iduronic acid, which in the 
context of investigating lysosomal storage disorders due 
to protein deficiencies, have not been relevant but are 
now required to generate a complete picture of human 
metabolism. Our gap filling examples showed that algo- 
rithms, such as SMILEY, can be used to direct hypoth- 
eses of novel functions in human metabolism. 
Nevertheless, a semi-automated approach was required 
to assist with the identification of plausible gap filUng 
candidates for experimental verification. 

Multiple gap finding and gap filling algorithms exist, 
including GapFind/GapFill [20] and GrowMatch [29], 
and the use of alternative algorithms will undoubtedly 
increase the number of possible hypotheses as they 
employ different heuristics and data sources (e.g., uni- 
versal databases). This work does not provide a compre- 
hensive list of possible gap-filling reaction solutions but 
rather assesses the use of (semi)-automated computa- 
tional approaches for identifying and completing missing 
functions in human metabolism on a large-scale. We 
found that computational tools, such as SMILEY, do not 
necessarily suggest biologically plausible gap filling 
hypotheses. The generated hypotheses need to be evalu- 
ated in a manual, time-consuming manner, similar to 
the gap filing process employed during the reconstruc- 
tion approach [4,26]. The search for novel functions is 
therefore only semi-automated. Automated algorithms 
could however be trained, based on manual effort, to 
prioritize or exclude certain types of solutions. In addi- 
tion, approaches could be developed that incorporate 
methods to build hypotheses of genes associated with 
orphan reactions [56-60], which SMILEY does not 
directly do. 

Identification of genes associated with biological plau- 
sible hypotheses as suggested by SMILEY was a major 
challenge. Relatively few knowledge gaps were resolved 
using known metabolic functions (Figure 3), and the 
solutions required detailed literature review such that 
homology, of what were often prokaryotic genes/pro- 
teins to possible human counterparts, could be assessed. 
In light of our results, we believe that existing automatic 
gap filling approaches for uncovering gene function will 
be of limited use for mammals. This limitation arises 
from a lack of phylogenetic information, which is exten- 
sively explored for annotating microbial genomes 
[58,61]. Although various homology databases exist for 
mammalian genomes covering up to seventy mamma- 
lian species [62], the majority of phenotypic, genetic, 
and biochemical studies have been performed using 
mice, and to a lesser extent, human cells. Information 
derived from these databases therefore originates from 
few organisms making them less useful for annotation 
purposes. Furthermore, co-expression analysis is used in 
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microbes to determine genes with related function 
[63-65] and could serve as a strategy for gene finding in 
the human genome. However, analysis of regions of cor- 
related transcription (RCT) in human and mouse identi- 
fied both related and unrelated genes being co- 
expressed [66]. The majorities of RCT were not found 
in both human and mouse, which the authors explained 
with i) missing definition of homology and/or synteny, 
ii) no conserved pattern, and/or iii) physiological differ- 
ences between human and mice. This example high- 
lights the challenges associated with finding novel gene 
functions in the human genome using established meth- 
ods from the bacterial world. Novel approaches may 
include the use of protein-protein interaction data 
[67-69], tissue-specific information [70,71] and disease 
information [72] combined with gap filling algorithms. 
In particular, the latter work [72] observed a high degree 
of correlation between known co-occurring (co-morbid) 
diseases in patients and flux-coupling of the reactions 
that are perturbed in association with each of the dis- 
ease states. Flux coupled reaction sets [73], or perfectly 
coupled reaction sets (Co-sets) [74], have been calcu- 
lated in genome-scale metabolic models. Co-sets are 
often along linear pathways [74]. Thus, a low co-mor- 
bidity of two metabolically linked diseases would indi- 
cate a missing link along a Co-set, which would break 
the flux coupling by creating a pathway split. Similarly, 
single nucleotide polymorphisms have been mapped 
onto metabolic networks [75] and may be used for iden- 
tifying missing functions in human metabolism. 

Ubiquitous unknowns e.g. genes with unknown func- 
tion and orphan enzymes belonging to orthologous 
families, have been identified as top targets for func- 
tional elucidation in terms of biological knowledge pay- 
off as these are ancient in origin and therefore likely to 
be involved in essential metabolic processes [22,25,76]. 
We believe that combining a metabolic network 
approach with knowledge of ubiquitous unknowns could 
also represent an ideal method for organism specific 
novel function identification. 

Conclusions 

The results presented here show that RECON 1 allows 
the identification of specific metabolic pathways and 
reactions for which knowledge is lacking; thereby focus- 
ing the search for unknown metabolic functions by put- 
ting them into context with previously gathered 
metabolic information. Following identification and 
characterisation of RECON 1 network gaps, we showed 
that gap solution hypotheses can be generated automati- 
cally and successfully but require detailed, time consum- 
ing manual investigation in order to validate their 
biological plausibility. In this manner we have derived 
multiple hypotheses, which we intend to use to direct 



our knowledge-driven approach towards novel metabolic 
discoveries. 

Methods 

Pre-processing to gap identification 

RECON 1 was obtained from Duarte et al. [11]. It con- 
tains 356 dead-end metabolites causing knowledge gaps 
in eight cellular compartments. The dead-end metabo- 
lites in the compartmentalised model are not unique, as 
the same metabolite can cause blocked reactions in two 
or more cellular compartments. In order to address this 
issue, RECON 1 was decompartmentalised, by placing 
all intracellular compartment reactions in the cytosol 
and removing duplicates. Extra-organism located reac- 
tions were kept. Therefore, the decompartmentalised 
network accounts for these two compartments. Subse- 
quently, the number of dead-end metabolites was 
reduced to 145, as they were unique. Note that this 
modification affected the following gap analysis in that 
compartment specific gaps were not considered for gap 
filling; e.g., a reaction present in mitochondria but miss- 
ing in cytosol would not result in a gap in the decom- 
partmentalised network. All subsequent analysis was 
performed using the decompartmentalised version of 
RECON 1 (Recon_l_decomp). 

As a next step, all blocked reactions present in 
Recon_l_decomp were identified using flux variability 
analysis (EVA) [27] as reactions unable to carry flux 
defined by |Vmax, i | < 10'^ mmol/gdw/hr and |Vmin, i 
|< 10'^ mmol/gdw/hr for all i reactions in the network. 
All exchange reactions were unconstrained permitting 
free uptake and secretion of respective metabolites. A 
total of 285 reactions were identified. 

The SMILEY algorithm has been described previously 
by Reed et al. [28] and implemented in the COBRA 
toolbox v2.0 (Schellenberger et al, submitted). We 
downloaded the KEGG [31] Ligand database (as of 
1.10.2009), deemed U. Furthermore, we constructed a 
transport matrix, X, by defining a transport reaction 
from cytosol to extra-organism and an exchange reac- 
tion for each metabolite occurring Recon_l_decomp (S) 
and U (Figure 1). Matrix U and X served as reaction 
source for the SMILEY algorithm. Prior to the calcula- 
tion, metabolites from S were matched to U. Note that 
not all metabolites in S have a known KEGG ID and 
that subsequent solutions are sensitive to missing KEGG 
IDs, meaning that some possible resolving solution may 
have been missed in our simulation due to this short- 
coming. Current work has focused on adding more 
metabolite identifiers to RECON 1 (Thiele et al, in 
preparation). 

In the next step, each blocked reaction Vb, i was cho- 
sen as objective function, requiring the SMILEY algo- 
rithm to find reactions in U and/or X to be added to S 
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such that |vb, i| could carry a flux greater than 10' 
mmol/gdw/hr. SMILEY is designed to find the shortest 
possible solution consistent with this requirement [28]. 
For each blocked reaction, we computed the 20 shortest 
SMILEY solutions. Note that the number of solutions 
computed is user defined and in some cases 20 distinct 
solutions may not exist. For 175 out of 285 blocked 
reactions at least one SMILEY solution could be found. 

Gap analysis and validation of SMILEY output 

SMILEY generated a total of 1335 solutions for the 175 
blocked reactions. We filtered the solutions such that: i) 
only category II solutions with the least number of 
resolving reactions were investigated, and ii) if no cate- 
gory II solution was found, the solution involving the 
least number of resolving reactions was investigated. In 
cases where a blocked reaction had multiple solutions 
with the same number of resolving reactions, a random 
solution was picked. These criteria generated the SI 
SMILEY solution output that is reported in the results. 

Biochemical and genetic information concerning each 
blocked reaction identified by EVA was obtained from 
the Bigg database http://bigg.ucsd.edu/ [18]. The reaction 
specific information allowed blocked reactions to be 
grouped depending on their metabolic pathway, subsys- 
tem, and/or other reaction specific features described 
within the Bigg database. Similarly, biochemical and 
genetic information concerning resolving reactions pro- 
posed by SMILEY was obtained from the KEGG data- 
base http://www.genome.jp/kegg/ [31], which allowed 
organisation of the resolving reaction output. For resol- 
ving reactions where no human gene or protein infor- 
mation was directly available from Bigg or KEGG, blast 
homology searches of genes encoding the resolving reac- 
tion activity were performed against the human 
sequence databases on the NCBI website http://blast. 
ncbi.nlm.nih.gov/Blast.cgi and using the STRING data- 
base [77]. The localisation of dead-end metabolites in 
human biofluids was obtained from the human metabo- 
lome project http://www.hmdb.ca/ [41]. Resolving reac- 
tions were investigated individually by literature review 
in order to verify the biological relevance of the pro- 
posed SMILEY solutions and generate plausible hypoth- 
eses for gap filling of RECON 1. Reaction 
directionalities were compared with experimentally 
reported reaction directionalities in the Brenda database 
[34] and by quantitative assignment of reaction direc- 
tionality using the von Bertalanffy 1.0 algorithm [33] an 
extension available freely as part of the openCOBRA 
project [78]. The details of these calculations will be 
published in a separate manuscript. Briefly, experimen- 
tally determined or computed standard metabolite 
Gibbs energy transformed to cellular compartmental 
conditions with respect to in vivo pH (pH = 5.5-8.0), 



temperature (37°C), ionic strength (0.25 M) and electri- 
cal potential (-150 - 30 mV) was used to predict the 
upper and lower bounds on standard transformed reac- 
tion Gibbs energy. The upper and lower bounds are 
dependent upon in vivo metabolite concentration ranges 
which were set to 10'^ - 10'^ M. When the transformed 
reaction Gibbs energy range spans zero the reaction is 
predicted to be quantitatively reversible. Spontaneous 
reactions proposed by SMILEY were assessed in a simi- 
lar manner as enzyme catalysed reactions. All gap filling 
hypotheses can be found in Additional files 6 and 7. 
Reaction and sub-cellular compartment abbreviations 
are as described in the BiGG database. 

Additional material 



Additional file 1: Metabolic subsystem distribution of the 175 
blocked reactions. The distribution of the blocked reactions identified in 
RECON 1 within metabolic subsystems. 

Additional file 2: Dead-end metabolites can cause multiple blocked 
reactions. In the majority of cases, dead-end metabolites only cause one 
blocked reaction. When a dead-end metabolite is at the end or 
beginning of a reaction cascade however, it inhibits flux through all 
reactions, which are part of the reaction cascade. The figure shows the 
number of blocked reactions found in the reaction cascades. For 
example, there is one reaction cascade, which has 14 blocked reactions. 

Additional file 3: The SMILEY solution distribution is not dependent 
upon the metabolic pathway. Blocked reactions can have multiple 
SMILEY solutions independent of the metabolic pathway, of which the 
blocked reaction is a part. 

Additional file 4: Characterisation of alternative SMILEY solution. 

The figure shows the solution categories of the alternative SMILEY 
solutions to blocked reactions, which had a category I (A), category II (B) 
or a category III (C) SI solution. A) 30% of the blocked reactions with a 
category I SI solution had alternative category III solutions. The 
remaining 70% had either an alternative category I solution or none at 
all. B) 61% of the blocked reactions, which had category II SI solution, 
had alternative category II or III solutions. C) None of the blocked 
reactions, which had category. III SI solutions had alternative category 
solutions as expected. 

Additional file 5: The pathway distribution of blocked reactions and 
their SMILEY SI solutions categories. The figure shows the SMILEY SI 
solution category as a percentage of the total number of blocked 
reactions found within a particular metabolic pathway. The number of 
blocked reactions within each metabolic pathway and their SMILEY 
solution type is also shown. Some blocked reactions, such as those 
involved in amino acid metabolism, are easily bypassed using 
functionalities already described in the KEGG, represented by category I 
and II SMILEY solutions. Others, such as those involved in glycan 
biosynthesis, can only be solved by transport of their causative dead-end 
metabolite out of the system. 

Additional file 6: The filtered SI SMILEY output. The table contains 
the filtered SMILEY SI output as described in materials and methods. The 
identified blocked reaction is given in column A. Columns B-J report 
properties of the blocked reactions i.e., causative dead end metabolites, 
BiGG confidence scores, cellular compartment, whether or not the 
blocked reaction is in a cascade, blocked reaction components, 
associated genes, the metabolic subsystem and the metabolic pathway 
the blocked reaction occurs in. Columns K-0 report properties of the SI 
solution i.e. the solution category, the experimentally reported 
directionality and computed directionalities of blocked reactions having 
category I reversal solutions, the number of resolving reactions 
composing the solution and the KEGG identifiers of each resolving 
reaction. The remaining columns report properties of each individual 
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resolving reaction comprising the SMILEY solution i.e. KEGG identifiers, 
resolving reaction components, metabolic pathway and subsystem and 
any associated enzyme commission numbers or known human genes. 

Additional file 7: The complete SMILEY output. The table contains 
the complete SMILEY output with up to twenty solutions for each 
identified blocked reaction. The identified blocked reaction is given in 
column A. In columns B-D the properties of the blocked reaction are 
reported (reaction components, gene and metabolic subsystem). In 
column E the resolving reactions comprising the complete SMILEY 
solution for the blocked reaction is reported with KEGG reaction 
identifiers. In the following columns the properties of the resolving 
reactions are reported (reaction components, reaction subsystem, 
enzyme commission number and gene). 
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